Overview

Dataset statistics

Number of variables: 51
Number of observations: 9999
Missing cells: 160562
Missing cells (%): 31.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.9 MiB
Average record size in memory: 408.0 B
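
The derived figures in this block can be cross-checked from the raw counts. The sketch below is illustrative only: it assumes the missing-cell percentage is taken over variables × observations and that the total memory is roughly the average record size times the row count. The variable names are not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 51
n_observations = 9999
missing_cells = 160562
avg_record_bytes = 408.0

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total memory is about (average record size) x (rows).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"memory:        {total_mib:.1f} MiB")
```

Under these assumptions the computed values agree with the reported 31.5% and 3.9 MiB after rounding.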

Variable types

NUM: 38
CAT: 8
BOOL: 5

Warnings

TransactionDT is highly correlated with TransactionID High correlation
TransactionID is highly correlated with TransactionDT High correlation
C2 is highly correlated with C1 and 3 other fields High correlation
C1 is highly correlated with C2 and 3 other fields High correlation
C6 is highly correlated with C1 and 3 other fields High correlation
C8 is highly correlated with C4 and 1 other field High correlation
C4 is highly correlated with C8 and 1 other field High correlation
C9 is highly correlated with C5 and 1 other field High correlation
C5 is highly correlated with C9 and 1 other field High correlation
C10 is highly correlated with C4 and 1 other field High correlation
C11 is highly correlated with C1 and 3 other fields High correlation
C13 is highly correlated with C5 and 1 other field High correlation
C14 is highly correlated with C1 and 3 other fields High correlation
D2 is highly correlated with D1 High correlation
D1 is highly correlated with D2 High correlation
D6 is highly correlated with D4 and 1 other field High correlation
D4 is highly correlated with D6 and 1 other field High correlation
D7 is highly correlated with D5 High correlation
D5 is highly correlated with D7 High correlation
D12 is highly correlated with D4 and 1 other field High correlation
card2 has 119 (1.2%) missing values Missing
addr2 has 913 (9.1%) missing values Missing
dist1 has 6614 (66.1%) missing values Missing
dist2 has 9621 (96.2%) missing values Missing
P_emaildomain has 2105 (21.1%) missing values Missing
R_emaildomain has 8374 (83.7%) missing values Missing
D2 has 4576 (45.8%) missing values Missing
D3 has 4272 (42.7%) missing values Missing
D4 has 6230 (62.3%) missing values Missing
D5 has 7231 (72.3%) missing values Missing
D6 has 9508 (95.1%) missing values Missing
D7 has 9776 (97.8%) missing values Missing
D8 has 8899 (89.0%) missing values Missing
D9 has 8899 (89.0%) missing values Missing
D10 has 1265 (12.7%) missing values Missing
D11 has 7723 (77.2%) missing values Missing
D12 has 9593 (95.9%) missing values Missing
D13 has 9719 (97.2%) missing values Missing
D14 has 9549 (95.5%) missing values Missing
D15 has 4930 (49.3%) missing values Missing
M1 has 5790 (57.9%) missing values Missing
M2 has 5790 (57.9%) missing values Missing
M3 has 5790 (57.9%) missing values Missing
M4 has 4886 (48.9%) missing values Missing
M5 has 5793 (57.9%) missing values Missing
M6 has 2572 (25.7%) missing values Missing
C3 is highly skewed (γ1 = 29.7433854) Skewed
C4 is highly skewed (γ1 = 31.1993148) Skewed
C7 is highly skewed (γ1 = 24.06917497) Skewed
C8 is highly skewed (γ1 = 29.37530263) Skewed
C10 is highly skewed (γ1 = 27.48382957) Skewed
C12 is highly skewed (γ1 = 28.31673412) Skewed
TransactionID has unique values Unique
dist1 has 313 (3.1%) zeros Zeros
C3 has 9909 (99.1%) zeros Zeros
C4 has 8355 (83.6%) zeros Zeros
C5 has 6013 (60.1%) zeros Zeros
C6 has 1014 (10.1%) zeros Zeros
C7 has 9095 (91.0%) zeros Zeros
C8 has 7812 (78.1%) zeros Zeros
C9 has 2981 (29.8%) zeros Zeros
C10 has 7905 (79.1%) zeros Zeros
C12 has 9088 (90.9%) zeros Zeros
C13 has 486 (4.9%) zeros Zeros
C14 has 492 (4.9%) zeros Zeros
D1 has 4523 (45.2%) zeros Zeros
D2 has 221 (2.2%) zeros Zeros
D3 has 1320 (13.2%) zeros Zeros
D4 has 1104 (11.0%) zeros Zeros
D5 has 492 (4.9%) zeros Zeros
D6 has 314 (3.1%) zeros Zeros
D10 has 3394 (33.9%) zeros Zeros
D11 has 667 (6.7%) zeros Zeros
D12 has 275 (2.8%) zeros Zeros
D13 has 231 (2.3%) zeros Zeros
D14 has 341 (3.4%) zeros Zeros
D15 has 1184 (11.8%) zeros Zeros
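
One way to act on the "Missing" warnings is to rank columns by their missing share and flag the ones that are almost entirely empty. The sketch below is a hedged illustration: the counts are copied from the warnings, but the 90% cut-off and the helper name `mostly_missing` are hypothetical choices, not something the report prescribes.

```python
N_OBSERVATIONS = 9999  # from "Number of observations"

# Missing-value counts copied from the "Missing" warnings.
missing_counts = {
    "card2": 119, "addr2": 913, "dist1": 6614, "dist2": 9621,
    "P_emaildomain": 2105, "R_emaildomain": 8374,
    "D2": 4576, "D3": 4272, "D4": 6230, "D5": 7231, "D6": 9508,
    "D7": 9776, "D8": 8899, "D9": 8899, "D10": 1265, "D11": 7723,
    "D12": 9593, "D13": 9719, "D14": 9549, "D15": 4930,
    "M1": 5790, "M2": 5790, "M3": 5790, "M4": 4886, "M5": 5793, "M6": 2572,
}

DROP_THRESHOLD = 0.90  # hypothetical cut-off, not taken from the report


def mostly_missing(counts, n_rows, threshold):
    """Return the column names whose missing share exceeds the threshold."""
    return sorted(name for name, c in counts.items() if c / n_rows > threshold)


drop_candidates = mostly_missing(missing_counts, N_OBSERVATIONS, DROP_THRESHOLD)
print(drop_candidates)
```

With this cut-off the candidates are dist2, D6, D7, D12, D13 and D14. D8 and D9 (89.0% missing) fall just under it, which shows how sensitive such a rule is to the chosen threshold.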

Reproduction

Analysis started: 2020-10-17 09:58:55.894873
Analysis finished: 2020-10-17 10:02:16.326452
Duration: 3 minutes and 20.43 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

TransactionID
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 9999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2991999
Minimum: 2987000
Maximum: 2996998
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 2987000
5-th percentile: 2987499.9
Q1: 2989499.5
Median: 2991999
Q3: 2994498.5
95-th percentile: 2996498.1
Maximum: 2996998
Range: 9998
Interquartile range (IQR): 4999

Descriptive statistics

Standard deviation: 2886.607005
Coefficient of variation (CV): 0.0009647753909
Kurtosis: -1.2
Mean: 2991999
Median Absolute Deviation (MAD): 2500
Skewness: 0
Sum: 2.9916998e+10
Variance: 8332500
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
2988029                1       < 0.1%
2987360                1       < 0.1%
2995564                1       < 0.1%
2989417                1       < 0.1%
2987368                1       < 0.1%
2993511                1       < 0.1%
2991462                1       < 0.1%
2995556                1       < 0.1%
2989409                1       < 0.1%
2993503                1       < 0.1%
Other values (9989)    9989    99.9%

Minimum 5 values
Value                  Count   Frequency (%)
2987000                1       < 0.1%
2987001                1       < 0.1%
2987002                1       < 0.1%
2987003                1       < 0.1%
2987004                1       < 0.1%

Maximum 5 values
Value                  Count   Frequency (%)
2996998                1       < 0.1%
2996997                1       < 0.1%
2996996                1       < 0.1%
2996995                1       < 0.1%
2996994                1       < 0.1%
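
The TransactionID statistics are what a run of consecutive integers would produce: 9999 distinct values, a range of 9998, and a strictly increasing order. The sketch below rebuilds the column under that assumption (the report itself does not state that the IDs are consecutive) and recomputes a few of the reported figures with the standard library.

```python
import statistics

# Assumption: IDs are the consecutive integers between the reported
# minimum and maximum (consistent with 9999 distinct values, range 9998).
ids = list(range(2987000, 2996998 + 1))

n = len(ids)
total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
median = statistics.median(ids)
# Median absolute deviation around the median.
mad = statistics.median([abs(x - median) for x in ids])

print(n, total, mean, variance, mad)
```

For n consecutive integers the sample variance is n(n + 1)/12, which gives 9999 × 10000 / 12 = 8332500, the value shown under "Variance".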
 

isFraud
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.1 KiB

Value   Count   Frequency (%)
0       9734    97.3%
1       265     2.7%
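
The isFraud counts show a strong class imbalance. The sketch below derives the fraud rate, the accuracy of a trivial "always predict 0" baseline, and one common weighting heuristic, n / (k × class count). The weighting scheme is an assumption chosen for illustration; the report does not recommend any particular treatment.

```python
# Value counts copied from the isFraud table.
counts = {0: 9734, 1: 265}

total = sum(counts.values())
fraud_rate = counts[1] / total

# Accuracy of a model that always predicts the majority class.
baseline_accuracy = max(counts.values()) / total

# Assumption: "balanced" weights n / (k * count), one heuristic among many.
n_classes = len(counts)
class_weights = {label: total / (n_classes * c) for label, c in counts.items()}

print(f"fraud rate:        {100 * fraud_rate:.1f}%")
print(f"baseline accuracy: {100 * baseline_accuracy:.1f}%")
print(class_weights)
```

A baseline that already scores about 97% accuracy suggests plain accuracy is a poor metric for this target.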
 

TransactionDT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9659
Distinct (%): 96.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 186896.8196
Minimum: 86400
Maximum: 313110
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 86400
5-th percentile: 94865.5
Q1: 146625.5
Median: 171642
Q3: 240106
95-th percentile: 272421.2
Maximum: 313110
Range: 226710
Interquartile range (IQR): 93480.5

Descriptive statistics

Standard deviation: 56561.41691
Coefficient of variation (CV): 0.3026344538
Kurtosis: -0.9648609127
Mean: 186896.8196
Median Absolute Deviation (MAD): 41973
Skewness: 0.2088639233
Sum: 1868781299
Variance: 3199193883
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
247199                 3       < 0.1%
245928                 3       < 0.1%
255587                 3       < 0.1%
243674                 3       < 0.1%
148537                 3       < 0.1%
150223                 3       < 0.1%
159876                 3       < 0.1%
172224                 3       < 0.1%
154898                 2       < 0.1%
170460                 2       < 0.1%
Other values (9649)    9971    99.7%

Minimum 5 values
Value                  Count   Frequency (%)
86400                  1       < 0.1%
86401                  1       < 0.1%
86469                  1       < 0.1%
86499                  1       < 0.1%
86506                  1       < 0.1%

Maximum 5 values
Value                  Count   Frequency (%)
313110                 1       < 0.1%
313099                 1       < 0.1%
313068                 1       < 0.1%
313063                 1       < 0.1%
313058                 1       < 0.1%
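
The minimum of 86400 equals the number of seconds in one day, which hints that TransactionDT may be a time offset measured in seconds. The report does not state the unit, so the sketch below treats it as an assumption and converts the reported extremes to days. The helper name `to_days` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# Minimum and maximum copied from the TransactionDT statistics.
dt_min, dt_max = 86400, 313110


def to_days(dt_seconds):
    """Convert a TransactionDT offset to fractional days (assumes seconds)."""
    return dt_seconds / SECONDS_PER_DAY


start_day = to_days(dt_min)
span_days = to_days(dt_max - dt_min)

print(f"first record at day {start_day:.2f}")
print(f"sample spans about {span_days:.2f} days")
```

If the unit assumption holds, the 9999 sampled rows cover a little over two and a half days.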
 

TransactionAmt
Real number (ℝ≥0)

Distinct: 1390
Distinct (%): 13.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 131.5394191
Minimum: 1.896
Maximum: 3247.91
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 1.896
5-th percentile: 20
Q1: 44
Median: 74.95
Q3: 131.116
95-th percentile: 425
Maximum: 3247.91
Range: 3246.014
Interquartile range (IQR): 87.116

Descriptive statistics

Standard deviation: 215.1460698
Coefficient of variation (CV): 1.635601489
Kurtosis: 69.10557669
Mean: 131.5394191
Median Absolute Deviation (MAD): 37.95
Skewness: 6.975465212
Sum: 1315262.652
Variance: 46287.83134
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
50                     658     6.6%
100                    653     6.5%
59                     369     3.7%
117                    325     3.3%
107.95                 274     2.7%
25                     257     2.6%
57.95                  239     2.4%
40                     225     2.3%
150                    217     2.2%
200                    205     2.1%
Other values (1380)    6577    65.8%

Minimum 5 values
Value                  Count   Frequency (%)
1.896                  1       < 0.1%
2.326                  1       < 0.1%
2.356                  2       < 0.1%
2.538                  2       < 0.1%
2.802                  1       < 0.1%

Maximum 5 values
Value                  Count   Frequency (%)
3247.91                1       < 0.1%
3162.95                4       < 0.1%
3000                   2       < 0.1%
2948.95                1       < 0.1%
2907.95                1       < 0.1%
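
The long right tail of TransactionAmt (skewness near 7) can be made concrete with the quartiles alone. The sketch below applies the conventional Tukey upper fence, Q3 + 1.5 × IQR. That rule is swapped in here purely for illustration; the report does not apply any outlier rule itself.

```python
# Quantiles copied from the TransactionAmt statistics.
q1, q3 = 44.0, 131.116
p95, maximum = 425.0, 3247.91

iqr = q3 - q1

# Tukey's rule of thumb: values above Q3 + 1.5 * IQR count as upper outliers.
upper_fence = q3 + 1.5 * iqr

print(f"IQR:         {iqr:.3f}")
print(f"upper fence: {upper_fence:.2f}")
print(f"95th percentile above fence: {p95 > upper_fence}")
```

Because even the 95th percentile (425) lies beyond the fence of roughly 261.79, more than 5% of transactions would be flagged by this rule, so a transform of the amount may be more useful than trimming.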
 

ProductCD
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.1 KiB

Value   Count   Frequency (%)
W       7708    77.1%
C       906     9.1%
H       871     8.7%
R       305     3.1%
S       209     2.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

card1
Real number (ℝ≥0)

Distinct2057
Distinct (%)20.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean9826.242024
Minimum1011
Maximum18390
Zeros0
Zeros (%)0.0%
Memory size78.1 KiB

Quantile statistics

Minimum1011
5-th percentile2115
Q16386.5
median9500
Q313780
95-th percentile17197
Maximum18390
Range17379
Interquartile range (IQR)7393.5

Descriptive statistics

Standard deviation4776.491455
Coefficient of variation (CV)0.4860954415
Kurtosis-1.039414826
Mean9826.242024
Median Absolute Deviation (MAD)3608
Skewness0.000981831413
Sum98252594
Variance22814870.62
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
79199109.1%
 
95002232.2%
 
171881771.8%
 
158851451.5%
 
125441261.3%
 
126951041.0%
 
28031031.0%
 
18132961.0%
 
11207961.0%
 
15066951.0%
 
Other values (2047)792479.2%
 
ValueCountFrequency (%) 
10111< 0.1%
 
10302< 0.1%
 
10331< 0.1%
 
1039110.1%
 
10471< 0.1%
 
ValueCountFrequency (%) 
183901< 0.1%
 
183871< 0.1%
 
1837580.1%
 
1837070.1%
 
183664< 0.1%
 

card2
Real number (ℝ≥0)

MISSING

Distinct418
Distinct (%)4.2%
Missing119
Missing (%)1.2%
Infinite0
Infinite (%)0.0%
Mean349.2927126
Minimum100
Maximum600
Zeros0
Zeros (%)0.0%
Memory size78.1 KiB

Quantile statistics

Minimum100
5-th percentile111
Q1194
median327
Q3500
95-th percentile567
Maximum600
Range500
Interquartile range (IQR)306

Descriptive statistics

Standard deviation157.767231
Coefficient of variation (CV)0.4516762743
Kurtosis-1.407941225
Mean349.2927126
Median Absolute Deviation (MAD)153
Skewness-0.04216555324
Sum3451012
Variance24890.49919
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1949389.4%
 
3218428.4%
 
1117467.5%
 
5557007.0%
 
4906086.1%
 
5833023.0%
 
1702592.6%
 
5142252.3%
 
3602152.2%
 
5452082.1%
 
Other values (408)483748.4%
 
ValueCountFrequency (%) 
1001301.3%
 
1011< 0.1%
 
1022< 0.1%
 
103460.5%
 
104250.3%
 
ValueCountFrequency (%) 
6002< 0.1%
 
5993< 0.1%
 
5982< 0.1%
 
5962< 0.1%
 
595180.2%
 

card3
Real number (ℝ≥0)

Distinct26
Distinct (%)0.3%
Missing1
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean152.7342468
Minimum100
Maximum225
Zeros0
Zeros (%)0.0%
Memory size78.1 KiB

Quantile statistics

Minimum100
5-th percentile150
Q1150
median150
Q3150
95-th percentile185
Maximum225
Range125
Interquartile range (IQR)0

Descriptive statistics

Standard deviation10.19588526
Coefficient of variation (CV)0.0667557242
Kurtosis8.758784179
Mean152.7342468
Median Absolute Deviation (MAD)0
Skewness2.634991341
Sum1527037
Variance103.9560762
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%) 
150908690.9%
 
1857917.9%
 
143170.2%
 
146160.2%
 
117140.1%
 
214110.1%
 
144100.1%
 
10270.1%
 
21350.1%
 
10050.1%
 
Other values (16)360.4%
 
ValueCountFrequency (%) 
10050.1%
 
10270.1%
 
10650.1%
 
117140.1%
 
1194< 0.1%
 
ValueCountFrequency (%) 
2251< 0.1%
 
214110.1%
 
21350.1%
 
2101< 0.1%
 
2031< 0.1%
 

card4
Categorical

Distinct4
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size78.1 KiB
visa
6208 
mastercard
3586 
american express
 
112
discover
 
92
ValueCountFrequency (%) 
visa620862.1%
 
mastercard358635.9%
 
american express1121.1%
 
discover920.9%
 
(Missing)1< 0.1%
 
Frequencies of value counts

Unique

Unique0 ?
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length16
Median length4
Mean length6.322932293
Min length3

card5
Real number (ℝ≥0)

Distinct45
Distinct (%)0.5%
Missing22
Missing (%)0.2%
Infinite0
Infinite (%)0.0%
Mean200.3194347
Minimum100
Maximum237
Zeros0
Zeros (%)0.0%
Memory size78.1 KiB

Quantile statistics

Minimum100
5-th percentile117
Q1166
median226
Q3226
95-th percentile226
Maximum237
Range137
Interquartile range (IQR)60

Descriptive statistics

Standard deviation39.08679011
Coefficient of variation (CV)0.1951223064
Kurtosis0.1853782589
Mean200.3194347
Median Absolute Deviation (MAD)2
Skewness-1.266370642
Sum1998587
Variance1527.777161
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
ValueCountFrequency (%) 
226493749.4%
 
224132613.3%
 
166128712.9%
 
1174534.5%
 
1023833.8%
 
2023743.7%
 
1382522.5%
 
1952062.1%
 
2191661.7%
 
1371591.6%
 
Other values (35)4344.3%
 
ValueCountFrequency (%) 
10070.1%
 
1023833.8%
 
1174534.5%
 
11860.1%
 
1261291.3%
 
ValueCountFrequency (%) 
2371< 0.1%
 
236120.1%
 
229380.4%
 
2283< 0.1%
 
226493749.4%
 

card6
Categorical

Distinct2
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size78.1 KiB
debit
7890 
credit
2108 
ValueCountFrequency (%) 
debit789078.9%
 
credit210821.1%
 
(Missing)1< 0.1%
 
Frequencies of value counts

Unique

Unique0 ?
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length6
Median length5
Mean length5.210621062
Min length3

addr2
Categorical

MISSING

Distinct2
Distinct (%)< 0.1%
Missing913
Missing (%)9.1%
Memory size78.1 KiB
87
9079 
96
 
7
ValueCountFrequency (%) 
87907990.8%
 
9670.1%
 
(Missing)9139.1%
 
Frequencies of value counts

Unique

Unique0 ?
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length4
Median length4
Mean length3.908690869
Min length3

dist1
Real number (ℝ≥0)

MISSING
ZEROS

Distinct467
Distinct (%)13.8%
Missing6614
Missing (%)66.1%
Infinite0
Infinite (%)0.0%
Mean115.9190547
Minimum0
Maximum4474
Zeros313
Zeros (%)3.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q13
median8
Q326
95-th percentile776.2
Maximum4474
Range4474
Interquartile range (IQR)23

Descriptive statistics

Standard deviation345.1165133
Coefficient of variation (CV)2.977219874
Kurtosis25.50442355
Mean115.9190547
Median Absolute Deviation (MAD)7
Skewness4.5601552
Sum392386
Variance119105.4077
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
03133.1%
 
22392.4%
 
12382.4%
 
31831.8%
 
41811.8%
 
51771.8%
 
61381.4%
 
71241.2%
 
91091.1%
 
111051.1%
 
Other values (457)157815.8%
 
(Missing)661466.1%
 
ValueCountFrequency (%) 
03133.1%
 
12382.4%
 
22392.4%
 
31831.8%
 
41811.8%
 
ValueCountFrequency (%) 
44741< 0.1%
 
27511< 0.1%
 
26291< 0.1%
 
24631< 0.1%
 
24571< 0.1%
 

dist2
Real number (ℝ≥0)

MISSING

Distinct138
Distinct (%)36.5%
Missing9621
Missing (%)96.2%
Infinite0
Infinite (%)0.0%
Mean290.4814815
Minimum0
Maximum5521
Zeros36
Zeros (%)0.4%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q17
median37.5
Q3285
95-th percentile1528
Maximum5521
Range5521
Interquartile range (IQR)278

Descriptive statistics

Standard deviation618.451258
Coefficient of variation (CV)2.129055714
Kurtosis21.82914019
Mean290.4814815
Median Absolute Deviation (MAD)36.5
Skewness3.974775956
Sum109802
Variance382481.9585
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
7590.6%
 
0360.4%
 
25130.1%
 
1120.1%
 
1528110.1%
 
2100.1%
 
4080.1%
 
670.1%
 
63170.1%
 
6660.1%
 
Other values (128)2092.1%
 
(Missing)962196.2%
 
ValueCountFrequency (%) 
0360.4%
 
1120.1%
 
2100.1%
 
31< 0.1%
 
460.1%
 
ValueCountFrequency (%) 
55211< 0.1%
 
46351< 0.1%
 
25272< 0.1%
 
25172< 0.1%
 
23952< 0.1%
 

P_emaildomain
Categorical

MISSING

Distinct50
Distinct (%)0.6%
Missing2105
Missing (%)21.1%
Memory size78.1 KiB
gmail.com
3677 
yahoo.com
1726 
hotmail.com
633 
anonymous.com
489 
aol.com
418 
Other values (45)
951 
ValueCountFrequency (%) 
gmail.com367736.8%
 
yahoo.com172617.3%
 
hotmail.com6336.3%
 
anonymous.com4894.9%
 
aol.com4184.2%
 
comcast.net1531.5%
 
icloud.com981.0%
 
outlook.com840.8%
 
msn.com590.6%
 
att.net530.5%
 
Other values (40)5045.0%
 
(Missing)210521.1%
 
Frequencies of value counts

Unique

Unique5 ?
Unique (%)0.1%
Histogram of lengths of the category

Length

Max length16
Median length9
Mean length8.083908391
Min length3

R_emaildomain
Categorical

MISSING

Distinct39
Distinct (%)2.4%
Missing8374
Missing (%)83.7%
Memory size78.1 KiB
gmail.com
642 
hotmail.com
346 
anonymous.com
233 
yahoo.com
124 
outlook.com
 
64
Other values (34)
216 
ValueCountFrequency (%) 
gmail.com6426.4%
 
hotmail.com3463.5%
 
anonymous.com2332.3%
 
yahoo.com1241.2%
 
outlook.com640.6%
 
comcast.net310.3%
 
yahoo.com.mx230.2%
 
aol.com220.2%
 
icloud.com180.2%
 
live.com.mx160.2%
 
Other values (29)1061.1%
 
(Missing)837483.7%
 
Frequencies of value counts

Unique

Unique8 ?
Unique (%)0.5%
Histogram of lengths of the category

Length

Max length16
Median length3
Mean length4.166616662
Min length3

C1
Real number (ℝ≥0)

HIGH CORRELATION

Distinct179
Distinct (%)1.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean8.405540554
Minimum0
Maximum735
Zeros7
Zeros (%)0.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median1
Q33
95-th percentile24.1
Maximum735
Range735
Interquartile range (IQR)2

Descriptive statistics

Standard deviation34.57578561
Coefficient of variation (CV)4.113451762
Kurtosis201.0025069
Mean8.405540554
Median Absolute Deviation (MAD)0
Skewness11.39847376
Sum84047
Variance1195.48495
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1527552.8%
 
2176017.6%
 
38808.8%
 
45065.1%
 
52832.8%
 
61671.7%
 
71401.4%
 
81151.2%
 
9700.7%
 
10620.6%
 
Other values (169)7417.4%
 
ValueCountFrequency (%) 
070.1%
 
1527552.8%
 
2176017.6%
 
38808.8%
 
45065.1%
 
ValueCountFrequency (%) 
7353< 0.1%
 
73450.1%
 
7332< 0.1%
 
2551< 0.1%
 
2541< 0.1%
 

C2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct168
Distinct (%)1.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.935493549
Minimum0
Maximum808
Zeros3
Zeros (%)< 0.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median1
Q33
95-th percentile25
Maximum808
Range808
Interquartile range (IQR)2

Descriptive statistics

Standard deviation34.40959086
Coefficient of variation (CV)4.336162666
Kurtosis292.739206
Mean7.935493549
Median Absolute Deviation (MAD)0
Skewness14.0793394
Sum79347
Variance1184.019943
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1540854.1%
 
2168316.8%
 
38648.6%
 
44644.6%
 
53283.3%
 
61711.7%
 
71451.5%
 
8680.7%
 
9630.6%
 
10620.6%
 
Other values (158)7437.4%
 
ValueCountFrequency (%) 
03< 0.1%
 
1540854.1%
 
2168316.8%
 
38648.6%
 
44644.6%
 
ValueCountFrequency (%) 
8083< 0.1%
 
8071< 0.1%
 
8051< 0.1%
 
8044< 0.1%
 
8031< 0.1%
 

C3
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0101010101
Minimum: 0
Maximum: 8
Zeros: 9909
Zeros (%): 99.1%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.1288460324
Coefficient of variation (CV): 12.7557572
Kurtosis: 1547.278931
Mean: 0.0101010101
Median Absolute Deviation (MAD): 0
Skewness: 29.7433854
Sum: 101
Variance: 0.01660130006
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Common values
Value   Count   Frequency (%)
0       9909    99.1%
1       86      0.9%
2       2       < 0.1%
3       1       < 0.1%
8       1       < 0.1%

Minimum 5 values
Value   Count   Frequency (%)
0       9909    99.1%
1       86      0.9%
2       2       < 0.1%
3       1       < 0.1%
8       1       < 0.1%

Maximum 5 values
Value   Count   Frequency (%)
8       1       < 0.1%
3       1       < 0.1%
2       2       < 0.1%
1       86      0.9%
0       9909    99.1%
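
Because C3 takes only five distinct values, its summary statistics can be rebuilt from the value-count table alone. The sketch below does so with the standard library. It assumes the report's variance uses the sample (n - 1) denominator, which is consistent with the figures shown but is not stated in the report.

```python
import math

# Value counts copied from the C3 table.
value_counts = {0: 9909, 1: 86, 2: 2, 3: 1, 8: 1}

n = sum(value_counts.values())
total = sum(v * c for v, c in value_counts.items())
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
sum_sq_dev = sum(c * (v - mean) ** 2 for v, c in value_counts.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)

zero_share = value_counts[0] / n

print(f"n={n} sum={total} mean={mean:.10f}")
print(f"variance={variance:.11f} std={std:.10f}")
print(f"zeros: {100 * zero_share:.1f}%")
```

The recomputed sum (101), variance and standard deviation agree with the reported values, which confirms that the table above lists every distinct value of the column.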
 

C4
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct28
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.8094809481
Minimum0
Maximum538
Zeros8355
Zeros (%)83.6%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum538
Range538
Interquartile range (IQR)0

Descriptive statistics

Standard deviation17.01490839
Coefficient of variation (CV)21.01952916
Kurtosis978.882104
Mean0.8094809481
Median Absolute Deviation (MAD)0
Skewness31.1993148
Sum8094
Variance289.5071075
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=28)
ValueCountFrequency (%) 
0835583.6%
 
1134613.5%
 
21641.6%
 
3520.5%
 
4200.2%
 
5200.2%
 
53650.1%
 
2150.1%
 
750.1%
 
5373< 0.1%
 
Other values (18)240.2%
 
ValueCountFrequency (%) 
0835583.6%
 
1134613.5%
 
21641.6%
 
3520.5%
 
4200.2%
 
ValueCountFrequency (%) 
5381< 0.1%
 
5373< 0.1%
 
53650.1%
 
5341< 0.1%
 
482< 0.1%
 

C5
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct165
Distinct (%)1.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.702770277
Minimum0
Maximum278
Zeros6013
Zeros (%)60.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile11
Maximum278
Range278
Interquartile range (IQR)1

Descriptive statistics

Standard deviation24.88580969
Coefficient of variation (CV)4.363810654
Kurtosis30.80285308
Mean5.702770277
Median Absolute Deviation (MAD)0
Skewness5.423403315
Sum57022
Variance619.303524
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0601360.1%
 
1223122.3%
 
26296.3%
 
32472.5%
 
41171.2%
 
5800.8%
 
6470.5%
 
7430.4%
 
10370.4%
 
8340.3%
 
Other values (155)5215.2%
 
ValueCountFrequency (%) 
0601360.1%
 
1223122.3%
 
26296.3%
 
32472.5%
 
41171.2%
 
ValueCountFrequency (%) 
2781< 0.1%
 
2701< 0.1%
 
2244< 0.1%
 
2201< 0.1%
 
2191< 0.1%
 

C6
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct143
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.02970297
Minimum0
Maximum551
Zeros1014
Zeros (%)10.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median1
Q32
95-th percentile16
Maximum551
Range551
Interquartile range (IQR)1

Descriptive statistics

Standard deviation25.7728979
Coefficient of variation (CV)4.274322969
Kurtosis209.5784877
Mean6.02970297
Median Absolute Deviation (MAD)0
Skewness11.82946576
Sum60291
Variance664.2422663
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1558055.8%
 
2140514.1%
 
0101410.1%
 
35705.7%
 
43413.4%
 
51851.9%
 
61161.2%
 
7850.9%
 
10450.5%
 
9400.4%
 
Other values (133)6186.2%
 
ValueCountFrequency (%) 
0101410.1%
 
1558055.8%
 
2140514.1%
 
35705.7%
 
43413.4%
 
ValueCountFrequency (%) 
5511< 0.1%
 
5503< 0.1%
 
54950.1%
 
5481< 0.1%
 
2881< 0.1%
 

C7
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct21
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.1707170717
Minimum0
Maximum48
Zeros9095
Zeros (%)91.0%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum48
Range48
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.423995839
Coefficient of variation (CV)8.341262092
Kurtosis676.2536671
Mean0.1707170717
Median Absolute Deviation (MAD)0
Skewness24.06917497
Sum1707
Variance2.027764149
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%) 
0909591.0%
 
17257.3%
 
21001.0%
 
4210.2%
 
3210.2%
 
5130.1%
 
750.1%
 
482< 0.1%
 
332< 0.1%
 
272< 0.1%
 
Other values (11)130.1%
 
ValueCountFrequency (%) 
0909591.0%
 
17257.3%
 
21001.0%
 
3210.2%
 
4210.2%
 
ValueCountFrequency (%) 
482< 0.1%
 
471< 0.1%
 
461< 0.1%
 
341< 0.1%
 
332< 0.1%
 

C8
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct40
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.8198819882
Minimum0
Maximum352
Zeros7812
Zeros (%)78.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum352
Range352
Interquartile range (IQR)0

Descriptive statistics

Standard deviation11.36235369
Coefficient of variation (CV)13.85852336
Kurtosis896.7872274
Mean0.8198819882
Median Absolute Deviation (MAD)0
Skewness29.37530263
Sum8198
Variance129.1030814
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)
ValueCountFrequency (%) 
0781278.1%
 
1172717.3%
 
21861.9%
 
3890.9%
 
4340.3%
 
6230.2%
 
5190.2%
 
8150.2%
 
7120.1%
 
990.1%
 
Other values (30)730.7%
 
ValueCountFrequency (%) 
0781278.1%
 
1172717.3%
 
21861.9%
 
3890.9%
 
4340.3%
 
ValueCountFrequency (%) 
3523< 0.1%
 
3511< 0.1%
 
35050.1%
 
3491< 0.1%
 
781< 0.1%
 

C9
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct118
Distinct (%)1.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.374137414
Minimum0
Maximum166
Zeros2981
Zeros (%)29.8%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q32
95-th percentile11
Maximum166
Range166
Interquartile range (IQR)2

Descriptive statistics

Standard deviation15.16700177
Coefficient of variation (CV)3.467426909
Kurtosis31.38910063
Mean4.374137414
Median Absolute Deviation (MAD)1
Skewness5.389115298
Sum43737
Variance230.0379428
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1396139.6%
 
0298129.8%
 
2129713.0%
 
35195.2%
 
42642.6%
 
51571.6%
 
61171.2%
 
7750.8%
 
10400.4%
 
9340.3%
 
Other values (108)5545.5%
 
ValueCountFrequency (%) 
0298129.8%
 
1396139.6%
 
2129713.0%
 
35195.2%
 
42642.6%
 
ValueCountFrequency (%) 
1661< 0.1%
 
1541< 0.1%
 
1521< 0.1%
 
1511< 0.1%
 
1501< 0.1%
 

C10
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct39
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.7923792379
Minimum0
Maximum303
Zeros7905
Zeros (%)79.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum303
Range303
Interquartile range (IQR)0

Descriptive statistics

Standard deviation10.04997007
Coefficient of variation (CV)12.68328293
Kurtosis805.652576
Mean0.7923792379
Median Absolute Deviation (MAD)0
Skewness27.48382957
Sum7923
Variance101.0018983
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
ValueCountFrequency (%) 
0790579.1%
 
1166216.6%
 
21621.6%
 
3890.9%
 
4240.2%
 
9180.2%
 
8170.2%
 
6150.2%
 
5150.2%
 
7140.1%
 
Other values (29)780.8%
 
ValueCountFrequency (%) 
0790579.1%
 
1166216.6%
 
21621.6%
 
3890.9%
 
4240.2%
 
ValueCountFrequency (%) 
3033< 0.1%
 
3021< 0.1%
 
30150.1%
 
3001< 0.1%
 
1041< 0.1%
 

C11
Real number (ℝ≥0)

HIGH CORRELATION

Distinct155
Distinct (%)1.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.084808481
Minimum0
Maximum578
Zeros5
Zeros (%)0.1%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median1
Q32
95-th percentile14
Maximum578
Range578
Interquartile range (IQR)1

Descriptive statistics

Standard deviation26.00767364
Coefficient of variation (CV)4.274197572
Kurtosis238.3171536
Mean6.084808481
Median Absolute Deviation (MAD)0
Skewness12.60464683
Sum60842
Variance676.3990881
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1654365.4%
 
2150215.0%
 
35895.9%
 
43053.1%
 
51571.6%
 
6971.0%
 
7790.8%
 
8610.6%
 
9460.5%
 
10440.4%
 
Other values (145)5765.8%
 
ValueCountFrequency (%) 
050.1%
 
1654365.4%
 
2150215.0%
 
35895.9%
 
43053.1%
 
ValueCountFrequency (%) 
5781< 0.1%
 
5773< 0.1%
 
57650.1%
 
5741< 0.1%
 
2171< 0.1%
 

C12
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct25
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.2281228123
Minimum0
Maximum84
Zeros9088
Zeros (%)90.9%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum84
Range84
Interquartile range (IQR)0

Descriptive statistics

Standard deviation2.055263975
Coefficient of variation (CV)9.009462731
Kurtosis1008.213231
Mean0.2281228123
Median Absolute Deviation (MAD)0
Skewness28.31673412
Sum2281
Variance4.224110009
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
ValueCountFrequency (%) 
0908890.9%
 
16206.2%
 
21491.5%
 
3530.5%
 
4200.2%
 
5110.1%
 
1090.1%
 
1370.1%
 
770.1%
 
2250.1%
 
Other values (15)300.3%
 
ValueCountFrequency (%) 
0908890.9%
 
16206.2%
 
21491.5%
 
3530.5%
 
4200.2%
 
ValueCountFrequency (%) 
841< 0.1%
 
832< 0.1%
 
711< 0.1%
 
422< 0.1%
 
371< 0.1%
 

C13
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct383
Distinct (%)3.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean29.26862686
Minimum0
Maximum852
Zeros486
Zeros (%)4.9%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median3
Q313
95-th percentile105.1
Maximum852
Range852
Interquartile range (IQR)12

Descriptive statistics

Standard deviation96.73879225
Coefficient of variation (CV)3.30520433
Kurtosis23.66554559
Mean29.26862686
Median Absolute Deviation (MAD)2
Skewness4.814828956
Sum292657
Variance9358.393926
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1321532.2%
 
29829.8%
 
35645.6%
 
04864.9%
 
43743.7%
 
53713.7%
 
62402.4%
 
72122.1%
 
82032.0%
 
91921.9%
 
Other values (373)316031.6%
 
ValueCountFrequency (%) 
04864.9%
 
1321532.2%
 
29829.8%
 
35645.6%
 
43743.7%
 
ValueCountFrequency (%) 
8521< 0.1%
 
8181< 0.1%
 
8081< 0.1%
 
8071< 0.1%
 
8061< 0.1%
 

C14
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct141
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.424042404
Minimum0
Maximum369
Zeros492
Zeros (%)4.9%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11
median1
Q32
95-th percentile14
Maximum369
Range369
Interquartile range (IQR)1

Descriptive statistics

Standard deviation23.23258825
Coefficient of variation (CV)3.616506054
Kurtosis72.36099184
Mean6.424042404
Median Absolute Deviation (MAD)0
Skewness7.109266092
Sum64234
Variance539.7531567
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1538153.8%
 
2163316.3%
 
37937.9%
 
04924.9%
 
44064.1%
 
52612.6%
 
61721.7%
 
71321.3%
 
8520.5%
 
11440.4%
 
Other values (131)6336.3%
 
ValueCountFrequency (%) 
04924.9%
 
1538153.8%
 
2163316.3%
 
37937.9%
 
44064.1%
 
ValueCountFrequency (%) 
3692< 0.1%
 
3681< 0.1%
 
36760.1%
 
3661< 0.1%
 
1901< 0.1%
 

D1
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct495
Distinct (%)5.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean96.69946995
Minimum0
Maximum634
Zeros4523
Zeros (%)45.2%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median10
Q3151
95-th percentile455
Maximum634
Range634
Interquartile range (IQR)151

Descriptive statistics

Standard deviation145.9789579
Coefficient of variation (CV)1.509614872
Kurtosis0.954184274
Mean96.69946995
Median Absolute Deviation (MAD)10
Skewness1.487578094
Sum966898
Variance21309.85616
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0452345.2%
 
11371.4%
 
30740.7%
 
2620.6%
 
28610.6%
 
29590.6%
 
91590.6%
 
3550.6%
 
14540.5%
 
180510.5%
 
Other values (485)486448.6%
 
ValueCountFrequency (%) 
0452345.2%
 
11371.4%
 
2620.6%
 
3550.6%
 
4430.4%
 
ValueCountFrequency (%) 
6341< 0.1%
 
6321< 0.1%
 
6262< 0.1%
 
6251< 0.1%
 
6041< 0.1%
 

D2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct493
Distinct (%)9.1%
Missing4576
Missing (%)45.8%
Infinite0
Infinite (%)0.0%
Mean168.8340402
Minimum0
Maximum634
Zeros221
Zeros (%)2.2%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile1
Q131
median115
Q3284
95-th percentile473
Maximum634
Range634
Interquartile range (IQR)253

Descriptive statistics

Standard deviation157.4188535
Coefficient of variation (CV)0.9323881209
Kurtosis-0.7598172869
Mean168.8340402
Median Absolute Deviation (MAD)97
Skewness0.7495539176
Sum915587
Variance24780.69543
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
02212.2%
 
11151.2%
 
30680.7%
 
91600.6%
 
29560.6%
 
28540.5%
 
3500.5%
 
14460.5%
 
59460.5%
 
2450.5%
 
Other values (483)466246.6%
 
(Missing)457645.8%
 
ValueCountFrequency (%) 
02212.2%
 
11151.2%
 
2450.5%
 
3500.5%
 
4340.3%
 
ValueCountFrequency (%) 
6341< 0.1%
 
6321< 0.1%
 
6251< 0.1%
 
6041< 0.1%
 
589110.1%
 

D3
Real number (ℝ≥0)

MISSING
ZEROS

Distinct290
Distinct (%)5.1%
Missing4272
Missing (%)42.7%
Infinite0
Infinite (%)0.0%
Mean28.15295966
Minimum0
Maximum487
Zeros1320
Zeros (%)13.2%
Memory size78.1 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median11
Q329
95-th percentile122
Maximum487
Range487
Interquartile range (IQR)28

Descriptive statistics

Standard deviation54.77897036
Coefficient of variation (CV)1.9457624
Kurtosis20.89249309
Mean28.15295966
Median Absolute Deviation (MAD)11
Skewness4.15686288
Sum161232
Variance3000.735593
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1320 | 13.2%
1 | 335 | 3.4%
30 | 237 | 2.4%
28 | 220 | 2.2%
2 | 178 | 1.8%
14 | 151 | 1.5%
29 | 149 | 1.5%
7 | 145 | 1.5%
3 | 141 | 1.4%
4 | 139 | 1.4%
Other values (280) | 2712 | 27.1%
(Missing) | 4272 | 42.7%

Value | Count | Frequency (%)
0 | 1320 | 13.2%
1 | 335 | 3.4%
2 | 178 | 1.8%
3 | 141 | 1.4%
4 | 139 | 1.4%

Value | Count | Frequency (%)
487 | 1 | < 0.1%
485 | 1 | < 0.1%
474 | 1 | < 0.1%
473 | 1 | < 0.1%
459 | 2 | < 0.1%
 

D4
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 484
Distinct (%): 12.8%
Missing: 6230
Missing (%): 62.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 167.7264526
Minimum: -122
Maximum: 657
Zeros: 1104
Zeros (%): 11.0%
Memory size: 78.1 KiB

Quantile statistics

Minimum: -122
5th percentile: 0
Q1: 0
Median: 88
Q3: 338
95th percentile: 479
Maximum: 657
Range: 779
Interquartile range (IQR): 338

Descriptive statistics

Standard deviation: 179.4981895
Coefficient of variation (CV): 1.07018414
Kurtosis: -1.220139624
Mean: 167.7264526
Median Absolute Deviation (MAD): 88
Skewness: 0.6025907887
Sum: 632161
Variance: 32219.60005
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1104 | 11.0%
456 | 35 | 0.4%
1 | 30 | 0.3%
479 | 27 | 0.3%
30 | 27 | 0.3%
483 | 27 | 0.3%
485 | 26 | 0.3%
28 | 26 | 0.3%
480 | 25 | 0.3%
481 | 25 | 0.3%
Other values (474) | 2417 | 24.2%
(Missing) | 6230 | 62.3%

Value | Count | Frequency (%)
-122 | 1 | < 0.1%
-90 | 1 | < 0.1%
-83 | 1 | < 0.1%
-15 | 1 | < 0.1%
-2 | 1 | < 0.1%

Value | Count | Frequency (%)
657 | 1 | < 0.1%
581 | 1 | < 0.1%
547 | 1 | < 0.1%
499 | 1 | < 0.1%
495 | 1 | < 0.1%
 

D5
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 259
Distinct (%): 9.4%
Missing: 7231
Missing (%): 72.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.04985549
Minimum: 0
Maximum: 475
Zeros: 492
Zeros (%): 4.9%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 2
Median: 13
Q3: 30
95th percentile: 170.95
Maximum: 475
Range: 475
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 65.17671378
Coefficient of variation (CV): 1.85954301
Kurtosis: 14.16339463
Mean: 35.04985549
Median Absolute Deviation (MAD): 13
Skewness: 3.520166994
Sum: 97018
Variance: 4248.004019
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 492 | 4.9%
1 | 152 | 1.5%
30 | 104 | 1.0%
28 | 98 | 1.0%
2 | 89 | 0.9%
14 | 80 | 0.8%
7 | 76 | 0.8%
4 | 71 | 0.7%
11 | 70 | 0.7%
3 | 66 | 0.7%
Other values (249) | 1470 | 14.7%
(Missing) | 7231 | 72.3%

Value | Count | Frequency (%)
0 | 492 | 4.9%
1 | 152 | 1.5%
2 | 89 | 0.9%
3 | 66 | 0.7%
4 | 71 | 0.7%

Value | Count | Frequency (%)
475 | 1 | < 0.1%
465 | 1 | < 0.1%
461 | 1 | < 0.1%
459 | 1 | < 0.1%
456 | 1 | < 0.1%
 

D6
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 122
Distinct (%): 24.8%
Missing: 9508
Missing (%): 95.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.74338086
Minimum: -83
Maximum: 695
Zeros: 314
Zeros (%): 3.1%
Memory size: 78.1 KiB

Quantile statistics

Minimum: -83
5th percentile: 0
Q1: 0
Median: 0
Q3: 49.5
95th percentile: 363.5
Maximum: 695
Range: 778
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 148.9159718
Coefficient of variation (CV): 1.992363339
Kurtosis: 4.031551549
Mean: 74.74338086
Median Absolute Deviation (MAD): 0
Skewness: 2.129327742
Sum: 36699
Variance: 22175.96667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 314 | 3.1%
1 | 8 | 0.1%
2 | 7 | 0.1%
357 | 6 | 0.1%
35 | 4 | < 0.1%
338 | 4 | < 0.1%
293 | 4 | < 0.1%
26 | 3 | < 0.1%
55 | 3 | < 0.1%
4 | 3 | < 0.1%
Other values (112) | 135 | 1.4%
(Missing) | 9508 | 95.1%

Value | Count | Frequency (%)
-83 | 1 | < 0.1%
0 | 314 | 3.1%
1 | 8 | 0.1%
2 | 7 | 0.1%
3 | 3 | < 0.1%

Value | Count | Frequency (%)
695 | 1 | < 0.1%
691 | 1 | < 0.1%
683 | 1 | < 0.1%
682 | 1 | < 0.1%
657 | 1 | < 0.1%
 

D7
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 82
Distinct (%): 36.8%
Missing: 9776
Missing (%): 97.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.30493274
Minimum: 0
Maximum: 407
Zeros: 73
Zeros (%): 0.7%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 5
Q3: 39.5
95th percentile: 336.2
Maximum: 407
Range: 407
Interquartile range (IQR): 39.5

Descriptive statistics

Standard deviation: 98.95647566
Coefficient of variation (CV): 1.891914787
Kurtosis: 3.568293163
Mean: 52.30493274
Median Absolute Deviation (MAD): 5
Skewness: 2.168986478
Sum: 11664
Variance: 9792.384075
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
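Value | Count | Frequency (%)
0 | 73 | 0.7%
1 | 23 | 0.2%
2 | 5 | 0.1%
4 | 5 | 0.1%
8 | 5 | 0.1%
3 | 4 | < 0.1%
13 | 4 | < 0.1%
7 | 4 | < 0.1%
18 | 4 | < 0.1%
16 | 3 | < 0.1%
Other values (72) | 93 | 0.9%
(Missing) | 9776 | 97.8%

Value | Count | Frequency (%)
0 | 73 | 0.7%
1 | 23 | 0.2%
2 | 5 | 0.1%
3 | 4 | < 0.1%
4 | 5 | 0.1%

Value | Count | Frequency (%)
407 | 1 | < 0.1%
367 | 1 | < 0.1%
365 | 1 | < 0.1%
364 | 1 | < 0.1%
360 | 1 | < 0.1%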
 

D8
Real number (ℝ≥0)

MISSING

Distinct: 665
Distinct (%): 60.5%
Missing: 8899
Missing (%): 89.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 144.8621209
Minimum: 0
Maximum: 1123.916626
Zeros: 15
Zeros (%): 0.2%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0.125
Q1: 1.666666031
Median: 38.14583206
Q3: 199.2812461
95th percentile: 764.608313
Maximum: 1123.916626
Range: 1123.916626
Interquartile range (IQR): 197.6145801

Descriptive statistics

Standard deviation: 225.7894857
Coefficient of variation (CV): 1.558650973
Kurtosis: 3.486766277
Mean: 144.8621209
Median Absolute Deviation (MAD): 37.58333257
Skewness: 2.032820566
Sum: 159348.333
Variance: 50980.89186
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.041666001 | 25 | 0.3%
0.708333015 | 22 | 0.2%
0.916665971 | 20 | 0.2%
0.791665971 | 17 | 0.2%
0.875 | 15 | 0.2%
0 | 15 | 0.2%
1.666666031 | 15 | 0.2%
0.583333015 | 14 | 0.1%
0.083333001 | 14 | 0.1%
0.833333015 | 14 | 0.1%
Other values (655) | 929 | 9.3%
(Missing) | 8899 | 89.0%

Value | Count | Frequency (%)
0 | 15 | 0.2%
0.041666001 | 25 | 0.3%
0.083333001 | 14 | 0.1%
0.125 | 8 | 0.1%
0.166666001 | 11 | 0.1%

Value | Count | Frequency (%)
1123.916626 | 1 | < 0.1%
1122.708374 | 1 | < 0.1%
1122.625 | 1 | < 0.1%
872.875 | 1 | < 0.1%
872.833313 | 1 | < 0.1%
 

D9
Real number (ℝ≥0)

MISSING

Distinct: 24
Distinct (%): 2.2%
Missing: 8899
Missing (%): 89.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5093936
Minimum: 0
Maximum: 0.958333015
Zeros: 75
Zeros (%): 0.8%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0.125
Median: 0.625
Q3: 0.791665971
95th percentile: 0.916665971
Maximum: 0.958333015
Range: 0.958333015
Interquartile range (IQR): 0.666665971

Descriptive statistics

Standard deviation: 0.3287476449
Coefficient of variation (CV): 0.6453705835
Kurtosis: -1.450174168
Mean: 0.5093936
Median Absolute Deviation (MAD): 0.25
Skewness: -0.3283162134
Sum: 560.33296
Variance: 0.108075014
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Value | Count | Frequency (%)
0.041666001 | 82 | 0.8%
0.791665971 | 77 | 0.8%
0 | 75 | 0.8%
0.708333015 | 72 | 0.7%
0.083333001 | 70 | 0.7%
0.666665971 | 67 | 0.7%
0.875 | 63 | 0.6%
0.583333015 | 62 | 0.6%
0.833333015 | 62 | 0.6%
0.75 | 60 | 0.6%
Other values (14) | 410 | 4.1%
(Missing) | 8899 | 89.0%

Value | Count | Frequency (%)
0 | 75 | 0.8%
0.041666001 | 82 | 0.8%
0.083333001 | 70 | 0.7%
0.125 | 58 | 0.6%
0.166666001 | 45 | 0.5%

Value | Count | Frequency (%)
0.958333015 | 54 | 0.5%
0.916665971 | 55 | 0.6%
0.875 | 63 | 0.6%
0.833333015 | 62 | 0.6%
0.791665971 | 77 | 0.8%
 

D10
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 501
Distinct (%): 5.7%
Missing: 1265
Missing (%): 12.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.2858942
Minimum: 0
Maximum: 695
Zeros: 3394
Zeros (%): 33.9%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 29
Q3: 233
95th percentile: 475
Maximum: 695
Range: 695
Interquartile range (IQR): 233

Descriptive statistics

Standard deviation: 166.680034
Coefficient of variation (CV): 1.309493365
Kurtosis: -0.3815108148
Mean: 127.2858942
Median Absolute Deviation (MAD): 29
Skewness: 1.060594249
Sum: 1111715
Variance: 27782.23373
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 3394 | 33.9%
1 | 90 | 0.9%
28 | 76 | 0.8%
30 | 68 | 0.7%
483 | 61 | 0.6%
29 | 50 | 0.5%
484 | 48 | 0.5%
485 | 48 | 0.5%
3 | 44 | 0.4%
455 | 43 | 0.4%
Other values (491) | 4812 | 48.1%
(Missing) | 1265 | 12.7%

Value | Count | Frequency (%)
0 | 3394 | 33.9%
1 | 90 | 0.9%
2 | 37 | 0.4%
3 | 44 | 0.4%
4 | 30 | 0.3%

Value | Count | Frequency (%)
695 | 1 | < 0.1%
657 | 1 | < 0.1%
655 | 2 | < 0.1%
543 | 1 | < 0.1%
520 | 1 | < 0.1%
 

D11
Real number (ℝ)

MISSING
ZEROS

Distinct: 442
Distinct (%): 19.4%
Missing: 7723
Missing (%): 77.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.0144991
Minimum: -33
Maximum: 488
Zeros: 667
Zeros (%): 6.7%
Memory size: 78.1 KiB

Quantile statistics

Minimum: -33
5th percentile: 0
Q1: 0
Median: 90
Q3: 301
95th percentile: 454
Maximum: 488
Range: 521
Interquartile range (IQR): 301

Descriptive statistics

Standard deviation: 164.419876
Coefficient of variation (CV): 1.04716365
Kurtosis: -1.141878358
Mean: 157.0144991
Median Absolute Deviation (MAD): 90
Skewness: 0.6065493462
Sum: 357365
Variance: 27033.89561
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 667 | 6.7%
455 | 16 | 0.2%
183 | 16 | 0.2%
13 | 15 | 0.2%
28 | 15 | 0.2%
423 | 15 | 0.2%
427 | 15 | 0.2%
426 | 14 | 0.1%
30 | 13 | 0.1%
1 | 13 | 0.1%
Other values (432) | 1477 | 14.8%
(Missing) | 7723 | 77.2%

Value | Count | Frequency (%)
-33 | 1 | < 0.1%
-15 | 1 | < 0.1%
-13 | 1 | < 0.1%
0 | 667 | 6.7%
1 | 13 | 0.1%

Value | Count | Frequency (%)
488 | 1 | < 0.1%
485 | 1 | < 0.1%
484 | 3 | < 0.1%
483 | 4 | < 0.1%
482 | 1 | < 0.1%
 

D12
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 93
Distinct (%): 22.9%
Missing: 9593
Missing (%): 95.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.74630542
Minimum: -83
Maximum: 479
Zeros: 275
Zeros (%): 2.8%
Memory size: 78.1 KiB

Quantile statistics

Minimum: -83
5th percentile: 0
Q1: 0
Median: 0
Q3: 13.75
95th percentile: 345.75
Maximum: 479
Range: 562
Interquartile range (IQR): 13.75

Descriptive statistics

Standard deviation: 114.5782052
Coefficient of variation (CV): 2.131834073
Kurtosis: 2.700489248
Mean: 53.74630542
Median Absolute Deviation (MAD): 0
Skewness: 2.019815328
Sum: 21821
Variance: 13128.16511
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 275 | 2.8%
1 | 7 | 0.1%
2 | 6 | 0.1%
357 | 5 | 0.1%
35 | 4 | < 0.1%
338 | 4 | < 0.1%
293 | 4 | < 0.1%
3 | 3 | < 0.1%
26 | 3 | < 0.1%
4 | 3 | < 0.1%
Other values (83) | 92 | 0.9%
(Missing) | 9593 | 95.9%

Value | Count | Frequency (%)
-83 | 1 | < 0.1%
0 | 275 | 2.8%
1 | 7 | 0.1%
2 | 6 | 0.1%
3 | 3 | < 0.1%

Value | Count | Frequency (%)
479 | 1 | < 0.1%
471 | 1 | < 0.1%
470 | 1 | < 0.1%
455 | 1 | < 0.1%
398 | 1 | < 0.1%
 

D13
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 36
Distinct (%): 12.9%
Missing: 9719
Missing (%): 97.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.79642857
Minimum: 0
Maximum: 367
Zeros: 231
Zeros (%): 2.3%
Memory size: 78.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 229
Maximum: 367
Range: 367
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 67.16073365
Coefficient of variation (CV): 3.22943593
Kurtosis: 11.92075082
Mean: 20.79642857
Median Absolute Deviation (MAD): 0
Skewness: 3.562301271
Sum: 5823
Variance: 4510.564145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
Value | Count | Frequency (%)
0 | 231 | 2.3%
1 | 4 | < 0.1%
229 | 4 | < 0.1%
2 | 3 | < 0.1%
3 | 3 | < 0.1%
60 | 2 | < 0.1%
47 | 2 | < 0.1%
295 | 2 | < 0.1%
13 | 2 | < 0.1%
122 | 1 | < 0.1%
Other values (26) | 26 | 0.3%
(Missing) | 9719 | 97.2%

Value | Count | Frequency (%)
0 | 231 | 2.3%
1 | 4 | < 0.1%
2 | 3 | < 0.1%
3 | 3 | < 0.1%
7 | 1 | < 0.1%

Value | Count | Frequency (%)
367 | 1 | < 0.1%
357 | 1 | < 0.1%
312 | 1 | < 0.1%
305 | 1 | < 0.1%
303 | 1 | < 0.1%
 

D14
Real number (ℝ)

MISSING
ZEROS

Distinct: 84
Distinct (%): 18.7%
Missing: 9549
Missing (%): 95.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.49777778
Minimum: -193
Maximum: 696
Zeros: 341
Zeros (%): 3.4%
Memory size: 78.1 KiB

Quantile statistics

Minimum: -193
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 334.75
Maximum: 696
Range: 889
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 118.2773579
Coefficient of variation (CV): 2.599629338
Kurtosis: 8.288516436
Mean: 45.49777778
Median Absolute Deviation (MAD): 0
Skewness: 2.806387859
Sum: 20474
Variance: 13989.5334
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 341 | 3.4%
84 | 7 | 0.1%
2 | 5 | 0.1%
50 | 4 | < 0.1%
1 | 3 | < 0.1%
3 | 3 | < 0.1%
361 | 2 | < 0.1%
71 | 2 | < 0.1%
250 | 2 | < 0.1%
-193 | 2 | < 0.1%
Other values (74) | 79 | 0.8%
(Missing) | 9549 | 95.5%

Value | Count | Frequency (%)
-193 | 2 | < 0.1%
-83 | 1 | < 0.1%
0 | 341 | 3.4%
1 | 3 | < 0.1%
2 | 5 | 0.1%

Value | Count | Frequency (%)
696 | 1 | < 0.1%
693 | 1 | < 0.1%
598 | 1 | < 0.1%
595 | 1 | < 0.1%
543 | 1 | < 0.1%
 

D15
Real number (ℝ)

MISSING
ZEROS

Distinct: 507
Distinct (%): 10.0%
Missing: 4930
Missing (%): 49.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 196.1570329
Minimum: -83
Maximum: 695
Zeros: 1184
Zeros (%): 11.8%
Memory size: 78.1 KiB

Quantile statistics

Minimum: -83
5th percentile: 0
Q1: 4
Median: 147
Q3: 391
95th percentile: 481
Maximum: 695
Range: 778
Interquartile range (IQR): 387

Descriptive statistics

Standard deviation: 184.931821
Coefficient of variation (CV): 0.9427743589
Kurtosis: -1.438556399
Mean: 196.1570329
Median Absolute Deviation (MAD): 147
Skewness: 0.3773596425
Sum: 994320
Variance: 34199.77841
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1184 | 11.8%
456 | 56 | 0.6%
455 | 56 | 0.6%
485 | 53 | 0.5%
484 | 49 | 0.5%
483 | 47 | 0.5%
479 | 43 | 0.4%
475 | 37 | 0.4%
1 | 35 | 0.4%
482 | 35 | 0.4%
Other values (497) | 3474 | 34.7%
(Missing) | 4930 | 49.3%

Value | Count | Frequency (%)
-83 | 1 | < 0.1%
-60 | 1 | < 0.1%
-30 | 1 | < 0.1%
-15 | 1 | < 0.1%
-13 | 1 | < 0.1%

Value | Count | Frequency (%)
695 | 1 | < 0.1%
665 | 1 | < 0.1%
657 | 1 | < 0.1%
626 | 2 | < 0.1%
591 | 1 | < 0.1%
 

M1
Categorical

MISSING

Distinct: 1
Distinct (%): < 0.1%
Missing: 5790
Missing (%): 57.9%
Memory size: 78.1 KiB

Value | Count | Frequency (%)
T | 4209 | 42.1%
(Missing) | 5790 | 57.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.158115812
Min length: 1

M2
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 5790
Missing (%): 57.9%
Memory size: 78.1 KiB

Value | Count | Frequency (%)
T | 3796 | 38.0%
F | 413 | 4.1%
(Missing) | 5790 | 57.9%

M3
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 5790
Missing (%): 57.9%
Memory size: 78.1 KiB

Value | Count | Frequency (%)
T | 3290 | 32.9%
F | 919 | 9.2%
(Missing) | 5790 | 57.9%

M4
Categorical

MISSING

Distinct: 3
Distinct (%): 0.1%
Missing: 4886
Missing (%): 48.9%
Memory size: 78.1 KiB

Value | Count | Frequency (%)
M0 | 3582 | 35.8%
M1 | 872 | 8.7%
M2 | 659 | 6.6%
(Missing) | 4886 | 48.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 2
Mean length: 2.488648865
Min length: 2

M5
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 5793
Missing (%): 57.9%
Memory size: 78.1 KiB

Value | Count | Frequency (%)
F | 2263 | 22.6%
T | 1943 | 19.4%
(Missing) | 5793 | 57.9%

M6
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 2572
Missing (%): 25.7%
Memory size: 78.1 KiB

Value | Count | Frequency (%)
F | 4212 | 42.1%
T | 3215 | 32.2%
(Missing) | 2572 | 25.7%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the slope of the line affects only the sign of r, not its magnitude.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
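The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator runs; the function name `pearson_r` is hypothetical, and the sample (n - 1) normalisation is an assumption that cancels out of the ratio anyway.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
print(pearson_r(x, y))
# Shifting and rescaling x by a positive factor leaves r unchanged.
print(pearson_r([3 * a + 5 for a in x], y))
```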

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
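As a rough sketch of that recipe, the snippet below converts each variable to ranks and then applies the Pearson formula to the ranks. The helper names are hypothetical, and assigning tied values their average rank is an assumption (a common convention) rather than something stated in this report.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]    # monotonic but nonlinear
print(spearman_rho(x, y))  # rank agreement is perfect
print(pearson_r(x, y))     # linear agreement is not
```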

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
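The pair-counting definition translates directly into code. The sketch below implements that simple form (often called tau-a), in which tied pairs count as neither concordant nor discordant; the function name is hypothetical, and library implementations commonly use a tie-adjusted variant (tau-b), so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant, 3 pairs
```

This brute-force version examines every pair, so it is quadratic in the number of observations and intended only as an illustration.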

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
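A minimal sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013), is shown below. The function name and the list-of-rows contingency table input are assumptions made for illustration, and the exact estimator used by the report generator may differ in detail.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        raise ValueError("need at least a 2x2 table with 2 or more observations")

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent rows and columns
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated
```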

Missing values

Sample

First rows

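TransactionID | isFraud | TransactionDT | TransactionAmt | ProductCD | card1 | card2 | card3 | card4 | card5 | card6 | addr2 | dist1 | dist2 | P_emaildomain | R_emaildomain | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 | D11 | D12 | D13 | D14 | D15 | M1 | M2 | M3 | M4 | M5 | M6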
0298700008640068.5W13926NaN150.0discover142.0credit87.019.0NaNNaNNaN1100010010201114NaN13.0NaNNaNNaNNaNNaNNaN13.013.0NaNNaNNaN0.0TTTM2FT
1298700108640129.0W2755404.0150.0mastercard102.0credit87.0NaNNaNgmail.comNaN110001000010110NaNNaN0.0NaNNaNNaNNaNNaN0.0NaNNaNNaNNaN0.0NaNNaNNaNM0TT
2298700208646959.0W4663490.0150.0visa166.0debit87.0287.0NaNoutlook.comNaN110001001010110NaNNaN0.0NaNNaNNaNNaNNaN0.0315.0NaNNaNNaN315.0TTTM0FF
3298700308649950.0W18132567.0150.0mastercard117.0debit87.0NaNNaNyahoo.comNaN250004001010251112112.00.094.00.0NaNNaNNaNNaN84.0NaNNaNNaNNaN111.0NaNNaNNaNM0TF
4298700408650650.0H4497514.0150.0mastercard102.0credit87.0NaNNaNgmail.comNaN110001010110110NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
5298700508651049.0W5937555.0150.0visa226.0debit87.036.0NaNgmail.comNaN110001001010110NaNNaN0.0NaNNaNNaNNaNNaN0.00.0NaNNaNNaN0.0TTTM1FT
62987006086522159.0W12308360.0150.0visa166.0debit87.00.0NaNyahoo.comNaN110001001010110NaNNaN0.0NaNNaNNaNNaNNaN0.00.0NaNNaNNaN0.0TTTM0FF
72987007086529422.5W12695490.0150.0visa226.0debit87.0NaNNaNmail.comNaN110001000010110NaNNaN0.0NaNNaNNaNNaNNaN0.0NaNNaNNaNNaN0.0NaNNaNNaNM0FF
8298700808653515.0H2803100.0150.0visa226.0debit87.0NaNNaNanonymous.comNaN110001010110110NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
92987009086536117.0W17399111.0150.0mastercard224.0debit87.019.0NaNyahoo.comNaN2200030030101226161.030.0318.030.0NaNNaNNaNNaN40.0302.0NaNNaNNaN318.0TTTM0TT

Last rows

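TransactionID | isFraud | TransactionDT | TransactionAmt | ProductCD | card1 | card2 | card3 | card4 | card5 | card6 | addr2 | dist1 | dist2 | P_emaildomain | R_emaildomain | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 | D11 | D12 | D13 | D14 | D15 | M1 | M2 | M3 | M4 | M5 | M6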
99892996989031295031.950W1546111.0150.0visa226.0debit87.04.0NaNanonymous.comNaN116980016281006806905921029090.030.0182.030.0NaNNaNNaNNaN487.0275.0NaNNaNNaN487.0TTTNaNNaNT
99902996990031297229.000W12741106.0150.0visa226.0debit87.07.0NaNgmail.comNaN913001100070701166205205.00.0NaNNaNNaNNaNNaNNaN136.0NaNNaNNaNNaNNaNTTTM0TF
99912996991031303020.000H6924555.0150.0mastercard117.0debit87.0NaNNaNhotmail.comNaN110001010110110NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
999229969920313042107.950W7919194.0150.0mastercard166.0debit87.02.0NaNNaNNaN161300141100801103513156156.029.0NaNNaNNaNNaNNaNNaN277.0NaNNaNNaNNaNNaNTTFNaNNaNF
99932996993031304450.000H3998399.0150.0american express223.0credit87.0NaNNaNyahoo.comgmail.com110101010110110NaNNaNNaNNaNNaNNaN107.5833360.583333NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
99942996994031305879.754C7794266.0185.0mastercard102.0creditNaNNaNNaNgmail.comgmail.com110101110111110NaNNaNNaNNaNNaNNaNNaNNaN0.0NaNNaNNaNNaNNaNNaNNaNNaNM0NaNNaN
99952996995031306340.000H13052254.0150.0visa226.0debit87.0NaNNaNgmail.comNaN110001010110110NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
99962996996031306857.000W11137NaN150.0visa226.0debit87.00.0NaNmail.comNaN523900113400320370120340NaNNaNNaNNaNNaNNaNNaNNaN349.0340.0NaNNaNNaNNaNTTTM0TF
999729969970313099108.950W15627239.0150.0mastercard224.0debit87.07.0NaNNaNNaN121200101200901003310101101.013.0NaNNaNNaNNaNNaNNaN101.044.0NaNNaNNaN101.0TTTNaNNaNF
999829969980313110160.950W7207111.0150.0visa226.0debit87.0NaNNaNgmail.comNaN110011000010212525.025.025.025.0NaNNaNNaNNaN25.0NaNNaNNaNNaN25.0NaNNaNNaNNaNNaNF